Speeding up ResNet training

Author

  • Daniel Kang
Abstract

The time required to train a model is an important limiting factor on the pace of progress in deep learning. The faster a model trains, the more options researchers can try in the same amount of time, and the higher the quality of their results. In this work we stacked a set of techniques to optimize the training time of a 20-layer ResNet and achieved a substantial speed-up relative to the baseline: our best stacked model trains about 5 times faster than the baseline model.
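The abstract does not enumerate the individual techniques, so the following is purely illustrative: a minimal PyTorch sketch of stacking two speed-ups that are commonly combined in this setting, mixed-precision arithmetic and a one-cycle learning-rate schedule. The torchvision resnet18 model (a stand-in, since torchvision ships no ResNet-20), the synthetic CIFAR-sized batches, and all hyperparameters are assumptions of mine, not the paper's recipe.

    import torch
    from torch import nn
    from torchvision.models import resnet18  # stand-in for the paper's 20-layer ResNet

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = resnet18(num_classes=10).to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)

    num_steps = 100  # placeholder; a real run covers the full training schedule
    scheduler = torch.optim.lr_scheduler.OneCycleLR(optimizer, max_lr=0.4, total_steps=num_steps)
    scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))  # loss scaling for fp16

    for step in range(num_steps):
        # Synthetic CIFAR-sized batch; a real run would draw from a DataLoader.
        inputs = torch.randn(128, 3, 32, 32, device=device)
        targets = torch.randint(0, 10, (128,), device=device)

        optimizer.zero_grad()
        with torch.cuda.amp.autocast(enabled=(device == "cuda")):  # fp16 forward pass where safe
            loss = criterion(model(inputs), targets)
        scaler.scale(loss).backward()  # scale the loss to avoid fp16 gradient underflow
        scaler.step(optimizer)
        scaler.update()
        scheduler.step()  # one-cycle schedule advances once per batch

The two pieces are independent of each other, so in a stack like this each one can be toggled on its own when measuring how much it contributes to the overall speed-up.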


Related articles

Learning Deep Resnet Blocks Sequentially

We prove a multiclass boosting theory for the ResNet architecture which simultaneously creates a new technique for multiclass boosting and provides a new algorithm for ResNet-style architectures. Our proposed training algorithm, BoostResNet, is particularly suitable for non-differentiable architectures. Our method only requires the relatively inexpensive sequential training of T “shallow ResNet...

Full text

Scale out for large minibatch SGD: Residual network training on ImageNet-1K with improved accuracy and reduced time to train

For the past 5 years, the ILSVRC competition and the ImageNet dataset have attracted a lot of interest from the Computer Vision community, allowing state-of-the-art accuracy to grow tremendously. This is largely credited to the use of deep artificial neural network designs. As these became more complex, the storage, bandwidth, and compute requirements increased. This means that with a non-di...

Full text

Learning Deep ResNet Blocks Sequentially using Boosting Theory

Deep neural networks are known to be difficult to train due to the instability of back-propagation. A deep residual network (ResNet) with identity loops remedies this by stabilizing gradient computations. We prove a boosting theory for the ResNet architecture. We construct T weak module classifiers, each of which contains two of the T layers, such that the combined strong learner is a ResNet. Therefore,...

Full text
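For reference, one way to see why a sum of such weak module classifiers can recover a full ResNet is a telescoping argument; the notation below is mine, and the paper's exact definitions may differ. With residual blocks g_{t+1}(x) = g_t(x) + f_t(g_t(x)) and auxiliary linear heads w_t (with w_0 = 0):

    h_t(x) = w_t^\top g_{t+1}(x) - w_{t-1}^\top g_t(x),
    \qquad
    \sum_{t=1}^{T} h_t(x) = w_T^\top g_{T+1}(x).

The ensemble of T weak modules therefore collapses to a linear classifier applied to the output of the full T-block residual network, which is what makes it possible to train the blocks one at a time, each on top of the already-trained blocks below it.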

Supplementary Material for “DualNet: Learn Complementary Features for Image Recognition”

Besides ResNet-20, we further evaluate DualNet based on the deeper ResNet [6], e.g., with 32 layers and 56 layers (denoted as ResNet-32 and ResNet-56, referring to the third-party implementation available at [2]). ResNet-32 and ResNet-56, as well as the corresponding DualNets (denoted as DNR32 and DNR56), are also trained on the augmented CIFAR-100, and the experimental results are shown in Table 1. The perfo...

Full text

Neumann Optimizer: A Practical Optimization Algorithm for Deep Neural Networks

Progress in deep learning is slowed by the days or weeks it takes to train large models. The natural solution of using more hardware is limited by diminishing returns, and leads to inefficient use of additional resources. In this paper, we present a large batch, stochastic optimization algorithm that is both faster than widely used algorithms for fixed amounts of computation, and also scales up...

Full text
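The name presumably refers to the Neumann series for a matrix inverse, a standard tool for approximating inverse-curvature products using only matrix-vector multiplies; the identity itself (not the paper's specific update rule) is

    A^{-1} = \sum_{k=0}^{\infty} (I - A)^k, \qquad \text{valid whenever } \rho(I - A) < 1,

so truncating the series yields an iterative approximation to A^{-1} v without ever forming or factoring A explicitly.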



Journal:

Volume   Issue

Pages   -

Publication date: 2017